JOURNAL OF GEOPHYSICAL RESEARCH, VOL. ???, XXXX, DOI:10.1029/,



A homogeneous sunspot areas database covering 
more than 130 years 



L. A. Balmaceda

S. K. Solanki

N. A. Krivova

Max-Planck-Institut für Sonnensystemforschung, Max-Planck-Str. 2, 37191
Katlenburg-Lindau, Germany

S. Foster

Space And Atmospheric Physics Group, Blackett Laboratory, Imperial
College London, Prince Consort Road, SW7 2BZ United Kingdom

L. A. Balmaceda, current affiliation: GACE/IPL - Universidad de Valencia, PO BOX 22085, 
46071 Valencia, Spain (laura.balmaceda@uv.es) 



DRAFT 



June 4, 2009, 4:26pm 



DRAFT 



X - 2 BALMACEDA ET AL.: A HOMOGENEOUS SUNSPOT AREAS DATABASE 

Abstract. 

The historical record of sunspot areas is a valuable and widely used proxy
of solar activity and variability. The Royal Greenwich Observatory (RGO)
regularly measured this and other parameters between 1874 and 1976. After
that time records from a number of different observatories are available.
These, however, show systematic differences and often have significant gaps.
Our goal is to obtain a uniform and complete sunspot area time series by
combining different data sets. A homogeneous composite of sunspot areas is
essential for different applications in solar physics, among others for
irradiance reconstructions. Data recorded simultaneously at different
observatories are statistically compared in order to determine the
intercalibration factors. Using these data we compile a complete and
cross-calibrated time series. The Greenwich data set is used as a basis until
1976, the Russian data (a compilation of observations made at stations in the
former USSR) between 1977 and 1985 and data compiled by the USAF network since
1986. Other data sets (Rome, Yunnan, Catania) are used to fill up the remaining
gaps. Using the final sunspot areas record the Photometric Sunspot Index is
calculated. We also show that the use of uncalibrated sunspot areas data sets
can seriously affect the estimate of irradiance variations. Our analysis
implies that there is no basis for the claim that UV irradiance variations have
a much smaller influence on climate than total solar irradiance variations.

1. Introduction

The total area of all sunspots visible on the solar hemisphere is one of the fundamental
indicators of solar magnetic activity. Measured since 1874, it provides a proxy of solar
activity over more than 130 years that is regularly used, e.g., to study the solar cycle or to
reconstruct total and spectral irradiance at earlier times [e.g., Brandt et al., 1994; Solanki
and Fligge, 1998; Li, 1999; Li et al., 2005; Preminger and Walton, 2005; Krivova et al.,
2007]. Consequently, a reliable and complete time series of sunspot areas is essential.
Since no single observatory made such records over this whole interval of time, different
data sets must be combined after an appropriate intercalibration. To this end, several
comparative studies have been carried out in order to get an appropriately cross-calibrated
sunspot area data set [see, for instance: Hoyt et al., 1983; Sivaraman et al., 1993; Fligge and
Solanki, 1997; Baranyi et al., 2001; Foster, 2004, and references therein]. They pointed
out that differences between data sets can arise due to random errors introduced by the
personal bias of the observer, limited seeing conditions at the observation site, different
amounts of scattered light, or the difference in the time when the observations were made.
Systematic errors also account for a disparity in the area measurements. They are related
to the observing and measurement techniques and different data reduction methods. For
example, areas measured from sunspot drawings are on average smaller than the ones
measured from photographic plates [Baranyi et al., 2001].

In this work, we compare data from Russian stations and the USAF (US Air Force) 
network as well as from other sources (Rome, Yunnan, Catania) with Royal Greenwich 
Observatory (RGO) data. This combination provides a good set of observations almost
free of gaps after 1976. Combining them appropriately improves the sunspot area time
series available at present.

In Section 2 we describe the data provided by the different observatories analyzed here.
The method to calculate the appropriate cross-calibration factors is explained in Section 3.1.
The results of the different comparisons are presented in Section 4. In Section 5 we discuss
one central application of sunspot areas: solar irradiance reconstructions. When sunspots
pass across the solar disc, a noticeable decrease in the measured total solar irradiance is
observed. This effect can be quantified by the photometric sunspot index [Willson et al.,
1980; Foukal, 1981; Hudson et al., 1982]. This index depends on the positions of the
sunspots on the visible solar disc and on the fraction of the disc covered by the spots,
i.e. on the sunspot area. It is thus clear that appropriately cross-calibrated sunspot areas
are required for accurate reconstructions of solar irradiance [see Fröhlich et al., 1994;
Fligge and Solanki, 1997]. We compare results when raw and calibrated data are used to
calculate this index and total solar irradiance in Section 6. Finally, Section 7 presents the
summary and conclusions.

2. Observational data 

Data from RGO provide the longest and most complete record of sunspot areas. The 
data were recorded at a small network of observatories (Cape of Good Hope, Kodaikanal 
and Mauritius) between 1874 and 1976, thus covering nine solar cycles. Heliographic 
positions and distance from the central meridian of sunspot groups are also available. 

The second data set is completely independent and was published in the Solnechniye
Danniye (Solar Data) Bulletin issued by the Pulkovo Astronomical Observatory. The
data were obtained at stations belonging to the former USSR. These stations provided
sunspot areas corrected for foreshortening, together with the heliographic position
(latitude, longitude) and distance from the centre of the solar disc in disc radii for each
sunspot group. We will refer to this data set as Russian data.

After RGO ceased its programme, the US Air Force (USAF) started compiling data 
from its own Solar Optical Observing Network (SOON). This network consists of solar 
observatories located in such a way that 24-hour synoptic solar monitoring can be main- 
tained. The basic observatories are Boulder and the members of the network of the US Air 
Force (Holloman, Learmonth, Palehua, Ramey and San Vito). Also, data from Mt. Wil- 
son Observatory are included. This programme has continued through to the present with 
the help of the US National Oceanic and Atmospheric Administration (NOAA). This data 
set is referred to by different names in the literature, e.g., SOON, USAF, USAF/NOAA, 
USAF/Mt. Wilson. In the following, we will refer to it as SOON. 

Usually multiple measurements are provided for a given sunspot group $n$ on a particular
day $d$ coming from different SOON stations, up to a maximum of 6 if all the stations
provided information. Normally, at least three values are listed. In order to get a unique
value for this group, we calculate averages of sunspot areas recorded on this day $d$
($A_{n,d}$), including only those values which fulfill the following condition:

$$\bar{A}_{n,d} - 2\sigma_A < A_{n,d} < \bar{A}_{n,d} + 2\sigma_A.$$

Here, $\bar{A}_{n,d}$ is the mean value of all the areas measured for the group $n$ on the
day $d$, and $\sigma_A$ is their standard deviation. The value of $\sigma_A$ varies from
group to group, depending on their sizes and on the time of the solar cycle. Note that, by
using this condition we intend to exclude outliers, i.e. those areas whose values differ
greatly from the mean for each group. In those cases where the number of stations providing
data is 1, the area for that group is taken from this single source. If the number is 2, the
area for that group is the mean of the areas measured by both stations.
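
The averaging rule described above can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def group_area(values):
    """Combine the areas reported by several stations for one group on one day.

    One or two values: plain mean. Three or more: mean of the values lying
    strictly within 2 sigma of the mean of all reported values.
    """
    if not values:
        raise ValueError("no measurements for this group")
    if len(values) <= 2:
        return mean(values)
    centre = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    kept = [a for a in values if centre - 2 * sigma < a < centre + 2 * sigma]
    # If every value is identical, sigma is 0 and nothing passes the strict
    # inequality; fall back to the plain mean in that case.
    return mean(kept) if kept else centre


def daily_area(groups):
    """Sum the combined group areas to obtain the daily value.

    `groups` maps a (hypothetical) group identifier to its list of station values.
    """
    return sum(group_area(v) for v in groups.values())
```

With six stations reporting 100, 100, 100, 100, 100 and 400 for one group, the last value lies more than 2 sigma from the mean and is dropped, so the combined area is 100.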

After that, sunspot areas for individual groups are summed up to get the daily value. 
Also averaged are latitudes and longitudes of each sunspot group recorded by those ob- 
servatories whose data are employed to get the mean sunspot area. 

These three data sets (RGO, Russia and SOON) are the prime sources of data that 
we consider, since they are the most complete, being based on observations provided by 
multiple stations. A number of further observatories have also regularly measured sunspot 
areas during the past decades. The record from Rome Astronomical Observatory, whose 
measurements began in 1958, covers more than three consecutive and complete solar cy- 
cles. It has several years of observations in common with Russian stations and SOON as 
well as with RGO. This is perhaps the only source of data with a long period of overlap 
with all three prime data sets. The database from Rome is used to compare the results 
obtained from the other observatories and also to fill up gaps whenever possible. Un- 
fortunately, its coverage is limited by weather conditions and instrumentation problems. 
Whenever available, data from Yunnan Observatory in China and Catania Astrophysical 
Observatory in Italy are also used to fill up the remaining gaps. In Catania, daily drawings 
of sunspot groups were made at the Cooke refractor on a 24.5 cm diameter projected im- 
age from the Sun, while the measurements provided by the Chinese observatory are based 
on good quality white light photographs. Table 1 summarizes the information provided 
by each observatory: the period in which observations were carried out, the observing 
technique, the coverage (i.e. the percentage of days on which measurements were made) 
and the minimum area reported by each observatory. Areas corrected for foreshortening
are provided in all cases. Directly observed (projected) areas can be derived using
the heliographic positions of the sunspot groups and hence the heliocentric angle, $\theta$,
or $\mu$-values ($\cos\theta$). Striking is the relatively large minimum area considered by
the SOON network. This suggests that many smaller sunspots are neglected in this record.

All the data used in this work were extracted from the following website:
http://www.ngdc.noaa.gov/stp/SOLAR/ftpsunspotregions.html.

3. Analysis 

3.1. Cross-calibration factors 

Daily sunspot areas from two different observatories are directly compared on each day 
on which both had recorded data. We deduced multiplication factors needed to bring all 
data sets to a common scale, namely that of RGO, which is employed as fiducial data set. 

For this, the spot areas from one data set are plotted vs the other (see left panels of
Fig. 1). The slope of a linear regression forced to pass through the origin (see Appendix A)
can be used to calibrate the sunspot area record considered auxiliary, $A_{aux}$, to the areas
of another basic data set, $A_{bas}$:

$$A_{bas} = b \cdot A_{aux}. \qquad (1)$$

First, this analysis is applied to all the points. The slope thus obtained is taken to
be the initial estimate for a second analysis where not all the points are taken into
account. Outliers are excluded by taking only points within $3\sigma_{fit}$ from the first
fit, where

$$\sigma_{fit} = \sqrt{\frac{1}{N-1} \sum_i \left[A_{bas}^i - b \cdot A_{aux}^i\right]^2}.$$

Also, only areas lying above the line joining the points $(0, 3\sigma_{fit})$ and
$(3\sigma_{fit}, 0)$ are considered. Through this measure points close to the
origin are excluded since they introduce a bias.

Ordinary least-square regression cannot be applied in this case, for the following reasons:
(1) the distinction between independent and dependent variables is arbitrary; (2) the data
do not provide formal errors for the measurements; (3) the intrinsic scatter of the data
may dominate any errors arising from the measurement procedure of sunspot areas. A
method that treats the variables symmetrically should be used instead.

To this purpose, the same procedure is repeated after interchanging the data sets taken
as a basis and as auxiliary. For the reasons outlined in Appendix A, the inverse value of
the slope now obtained, $b'$, differs from the slope $b$ obtained in the first place. Therefore,
the final calibration factor is calculated by averaging these two values: $b$ and $1/b'$.
This method is referred to as "bisector line" [Isobe et al., 1990].
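
A minimal sketch of the procedure just described is given below. It rests on several assumptions: the slope through the origin is taken to be the standard least-squares estimate (Appendix A is not reproduced here), the scatter is normalised by N - 1, and all function names are hypothetical.

```python
import math


def slope_through_origin(x, y):
    """Least-squares slope of y = b * x with zero intercept (assumed form of the fit)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


def clipped_slope(aux, bas):
    """Two-pass fit of bas = b * aux, excluding outliers and points near the origin."""
    b0 = slope_through_origin(aux, bas)
    resid = [yi - b0 * xi for xi, yi in zip(aux, bas)]
    # Scatter about the first fit (assumption: N - 1 normalisation).
    sigma_fit = math.sqrt(sum(r * r for r in resid) / (len(resid) - 1))
    keep = [
        (xi, yi)
        for xi, yi, r in zip(aux, bas, resid)
        # within 3 sigma of the first fit, and above the line joining
        # (0, 3 sigma) and (3 sigma, 0), i.e. x + y > 3 sigma
        if abs(r) <= 3 * sigma_fit and xi + yi > 3 * sigma_fit
    ]
    if not keep:
        return b0
    xs, ys = zip(*keep)
    return slope_through_origin(xs, ys)


def calibration_factor(aux, bas):
    """Average of b and 1/b', where b' comes from the fit with the roles swapped."""
    b = clipped_slope(aux, bas)
    b_prime = clipped_slope(bas, aux)
    return 0.5 * (b + 1.0 / b_prime)
```

Here `calibration_factor(aux, bas)` returns the number by which the auxiliary areas would be multiplied to bring them to the scale of the basic record. Treating both records symmetrically is what distinguishes this from a single ordinary regression.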

An alternative method to find the calibration factors is described in Appendix B. This
second method does not neglect the sunspot areas close to zero. In contrast, it gives equal
weight to all values. The calibration factors obtained in this way are thus less accurate
during high activity levels, when solar irradiance is most variable. Since the reconstruction
of solar irradiance is a key application of the new cross-calibrated sunspot area record,
we select the method described above rather than the one presented in Appendix B. Of
course, for other applications, the alternative method may be more appropriate. Therefore,
in Table 3 we also give the factors obtained in this way. The difference between the factors
obtained by the two methods is generally less than 5%, although differences as large as 12%
can be reached for factors deduced from corrected sunspot areas.

Data series that do not overlap in time can be intercalibrated using the Zurich sunspot
number as a common index [Fligge and Solanki, 1997; Vaquero et al., 2004]. Since this
approach requires an additional assumption, namely that the size distribution of sunspots
[Bogdan et al., 1988; Baumann and Solanki, 2005] remains unchanged over time, we avoid
using it for calibration purposes [Solanki and Unruh, 2004]. We use this comparison only
for confirmation of the results obtained from the direct measurements, so that the new
record is completely independent of the sunspot number time series.

3.2. Error estimates 

A single calibration factor is calculated for the whole period of overlap between data
sets obtained by two observatories. This is done separately for the projected areas and
for those corrected for foreshortening provided by the different observatories.

In some cases, however, the relation between two data sets was found to evolve with
time. This can be seen in the right panels of Fig. 1, in which the 12-month running
means of sunspot area records are plotted vs time. Both Figs. 1 and 2 show that even
after cross-calibration the two data sets do not run in parallel but rather have systematic
relative offsets over particular periods of time (lasting multiple years). Therefore factors
for different sub-intervals are also calculated, in order to estimate the uncertainties of
the final factors. This is performed by separating different solar cycles. When the whole
interval of overlap does not cover more than one cycle, the division is made where
a change in the behaviour is observed. See, e.g., the comparison between RGO and
Russian data in the upper right panel of Fig. 1, where the change takes place after
1971. Before that year, Russian areas are on average smaller than those from RGO,
while the situation is reversed afterwards. The uncertainty in the final factors is thus the
combination of the uncertainties due to the cycle-to-cycle variations (different factors for
different cycles and/or subintervals), $\sigma_{cyc}$, the difference between $b$ and $b'$,
$\sigma_{dif}$, and the errors, $\sigma_{slope}$, in determining the slopes, so that
$\sigma = \sqrt{\sigma_{cyc}^2 + \sigma_{dif}^2 + \sigma_{slope}^2}$. The main
source of uncertainty is the fact that the relationship between two given observatories
during the period they overlap is not uniform. Therefore, the smallest errors are obtained
when this period is short (see, e.g., Russia - Catania, SOON - Catania in Table 2). On the
other hand, the largest errors are found in the comparison between Russia and Yunnan.
This is discussed in more detail in the next Section.
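
As a small illustration of how the three contributions combine, the sketch below adds them in quadrature. The function name and the example numbers are hypothetical and are not values from Table 2.

```python
import math


def total_uncertainty(sigma_cyc, sigma_dif, sigma_slope):
    """Combine the three error contributions in quadrature."""
    return math.sqrt(sigma_cyc ** 2 + sigma_dif ** 2 + sigma_slope ** 2)


# Illustrative numbers only: the cycle-to-cycle term dominates the total.
example = total_uncertainty(0.10, 0.02, 0.01)
```

Because the terms add in quadrature, the largest one controls the result, which is consistent with the statement that the non-uniform relationship between observatories is the main source of uncertainty.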

4. Results and discussion 

4.1. Comparison between sunspot areas 

The results of the analysis described in Section 3 are summarized in Table 2. The first 
two columns give the names of the data sets being compared. The observatories whose 
data are taken as the basis are indicated as Obs. 1, while the observatories whose data 
are recalibrated are indicated as Obs. 2. The third column shows the interval of time 
over which they overlap. In the next two columns we list the calibration factors by which 
the data of Obs. 2 have to be multiplied in order to match those of Obs. 1. The factors 
for the originally measured areas (projected areas, PA) and for the areas corrected for 
foreshortening (CA) are given, in columns 5 and 6, respectively. The two last columns 
list the corresponding correlation coefficients between the two data sets. 

With one exception the correlation coefficients for the projected areas are larger than for 
the ones corrected for foreshortening. This is not unexpected, since errors in the measured 
position of a sunspot increase the scatter in the areas corrected for foreshortening, while 
leaving the projected areas unaffected. 

In the following we discuss the results in greater detail. The overlap between RGO
and Russian data covers the descending phase of cycle 20. As can be seen from Figs. 1a
and b the two sets agree rather well with each other. The cross-calibration factor is very
close to unity, although the difference between the two data sets displays a trend with
time. Before 1971, areas from RGO are larger (6% for projected, 8% in case of corrected
for foreshortening) than Russian measurements, whereas after that time areas from the
Russian data set are 8% larger (see Fig. 1b) for both projected and corrected areas.
This trend remains also after recalibrating the Russian data, because a single factor is
not sufficient to remove this effect. Since it is not clear which of these two data
sets (or both) contains an artificial drift, we do not try to correct for it.

Russian and SOON areas display more significant differences (see Fig. 1c and d).
The overlap covers the period from 1982 to 1991, or cycles 21 and 22. During the whole
time interval, SOON areas appear to be smaller (by on average 40% for projected and
45% for the corrected ones) than those of the Russian data. This is mainly due to the
significant difference in the minimum value of the counted sunspots (1 ppm of the solar
hemisphere for Russian vs 10 ppm of the solar hemisphere for SOON observations, see
Table 1). As can be seen from Fig. 1d, data from these two records also do not run
in parallel, exhibiting quite a significant trend relative to each other (compare solid and
dashed curves).

In general, it was found that areas measured by the SOON network as well as those by
the Rome, Catania and Yunnan observatories are on average smaller than areas reported
by RGO and Russian stations, which agrees with the fact that the minimum areas of
individual spots included in these two records are the smallest. For the same reason,
SOON areas are smaller on average than the measurements from other data sets: the
minimum area of the recorded spots is a factor of 3 to 10 higher for SOON than for the
other observatories.

The last three lines of Table 2 give the factors by which SOON, Catania and Yunnan
data need to be multiplied in order to match the RGO data. Since none of these data sets
overlaps with RGO, we have used the Russian data as intermediary. Of course, correlation
coefficients cannot be determined in this case. The factor needed to calibrate SOON data
to the RGO data set is 1.43 for projected areas, in good agreement with the results by
Hathaway et al. [2002] and Foster [2004], who both give 1.4. In the case of areas corrected
for foreshortening the factor found here is ~7% larger, being 1.49.
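
Calibrating through an intermediary amounts to multiplying two factors. The sketch below shows this chaining with made-up inputs; the two input factors are illustrative placeholders and not the entries of Table 2, and the function names are hypothetical.

```python
def chain_factors(aux_to_intermediate, intermediate_to_reference):
    """Factor that brings the auxiliary record to the reference scale via an intermediary."""
    return aux_to_intermediate * intermediate_to_reference


def recalibrate(areas, factor):
    """Apply a calibration factor to a list of areas."""
    return [a * factor for a in areas]


# Hypothetical inputs, chosen only to show the arithmetic:
soon_to_russia = 1.40  # placeholder, not a value from the paper
russia_to_rgo = 1.02   # placeholder, not a value from the paper
soon_to_rgo = chain_factors(soon_to_russia, russia_to_rgo)
```

No correlation coefficient can be attached to such a chained factor, since the two end records never observe the same days.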

4.2. Comparison with sunspot number 

The relationship between the Zurich relative sunspot number, $R_z$, and sunspot area
(from a single record) shows a roughly linear trend with a large scatter. In Fig. 2 we plot
sunspot areas corrected for foreshortening, $A_s$, for RGO measurements vs $R_z$. We have
chosen $A_s$ from RGO since this is the longest running data set. The plus signs represent
data points binned in groups of 50. These points indicate that the relationship is roughly,
but not exactly linear. In particular at low $R_z$ values, $A_s$ appears to be too small, possibly
because of the cutoff in the $A_s$ measurements. However, this behaviour may reflect also
the particular definition of $R_z = k(10g + s)$, where $g$ is the number of sunspot groups, $s$
the total number of distinct spots and $k$ the scaling factor (usually < 1) which depends
on the observer and is introduced in order to keep the original scale by Wolf [Waldmeier,
1961]. In this definition, even a small group of sunspots is given a nearly equally large
weight as a large group. It is observed from this plot that a given value of $R_z$ corresponds
to a range of values of sunspot areas. However, the scatter due to points within a single
cycle is larger than the scatter from cycle to cycle.

When studying the relationship between $A_s$ and $R_z$ for individual cycles, it was observed
that in some cases the scatter is significantly higher. In such cycles, large areas are
observed while $R_z$ remains low. In particular, the shape of $A_s$ cycles resembles that of $R_z$
cycles, but individual peaks are more accentuated in $A_s$. This could be also a consequence
of the definition of $R_z$, regarding the large weight given to the groups. Fligge and Solanki
[1997] already showed that, in general, the relationship between $A_s$ and $R_z$ changes only
slightly from one cycle to the next, with the difference being around 10%.
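
The weight that the definition of the sunspot number gives to groups can be made concrete with a small sketch. The function name is hypothetical and the spot counts are invented for illustration.

```python
def relative_sunspot_number(groups, spots, k=1.0):
    """Relative sunspot number k * (10 * g + s) as defined in the text."""
    return k * (10 * groups + spots)


# One group containing a single small spot versus one group with 20 spots:
small_group = relative_sunspot_number(groups=1, spots=1)   # 11.0
large_group = relative_sunspot_number(groups=1, spots=20)  # 30.0
```

The two cases differ by less than a factor of three in the sunspot number, although the total areas of such groups can differ by far more, which is one reason why a given sunspot number corresponds to a range of areas.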

Fig. 3 shows the comparison between RGO and SOON sunspot areas corrected for
foreshortening with the sunspot number. We have binned the data from each observatory
every 50 points according to the sunspot number. The uncalibrated areas from SOON lie
significantly below the ones from RGO. After multiplying SOON data by the calibration
factor of 1.49 found in Sect. 4.1, they display practically the same relationship to $R_z$ as
the RGO data.
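
One way to read the binning used for Figs. 2 and 3 is sketched below, under the assumption that the daily pairs are sorted by sunspot number and then averaged in consecutive groups of 50. The text does not spell out these details, and the function name is hypothetical.

```python
def bin_by_sunspot_number(rz, area, size=50):
    """Sort (R_z, area) pairs by R_z and average consecutive chunks of `size` points."""
    pairs = sorted(zip(rz, area))
    bins = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        mean_rz = sum(p[0] for p in chunk) / len(chunk)
        mean_area = sum(p[1] for p in chunk) / len(chunk)
        bins.append((mean_rz, mean_area))
    return bins
```

Applying the same binning to RGO areas and to SOON areas multiplied by the calibration factor would then allow the two binned relations to be compared directly.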

4.3. Cross-calibrated sunspot area records 

In a next step we create records of projected and corrected sunspot areas covering the
period from 1874 to 2008 that are consistently cross-calibrated to the RGO values. We
use RGO, Russia and SOON measurements as the primary sources of data. As shown
by Table 1, these sources provide the sets of sunspot area measurements with the fewest
gaps.

The individual periods of time over which each of these is taken as the primary source
are:

1874 - 1976 RGO
1977 - 1986 Russia
1987 - 2008 SOON

The final sunspot area composite is plotted in Fig. 4 (solid curve), and is tabulated
in Table 4 (only available electronically). We have chosen to use the Russian data set
until 1986 for the simple reason that this year corresponds to the solar minimum. In this
way, each data set describes different solar cycles (see Fig. 4). We are aware that this is
only approximately correct since sunspots from consecutive cycles overlap during a short
period of time, but this is a second order effect. In this combination we opt to multiply
the post-RGO measurements by the factors obtained here since the RGO areas data set is by
far the longest running and relatively homogeneous source. Any data gaps in the primary
source are filled using data from one of the other two primary records (if available), or
data from Rome and Yunnan, properly recalibrated. The two last-named series allowed
us to fill up the gaps over a total of 115 days. In this way, gaps in the final composite
cover only ~8% of the total length of the combined data set of 49308 days.
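
The assembly of the composite can be sketched as follows. This is an illustration rather than the authors' pipeline: the data layout, the function names and the order in which fallback sources are tried are assumptions, and any factor passed in other than the SOON value quoted in the text would be a placeholder.

```python
from datetime import date

# Primary source for each interval, as listed in the text.
PRIMARY = [(1874, 1976, "RGO"), (1977, 1986, "Russia"), (1987, 2008, "SOON")]
# Assumed order in which other records are tried when the primary one has a gap.
FALLBACK = ["RGO", "Russia", "SOON", "Rome", "Yunnan"]


def primary_source(day):
    """Name of the primary record for a given day, or None outside 1874 - 2008."""
    for first, last, name in PRIMARY:
        if first <= day.year <= last:
            return name
    return None


def composite_value(day, records, factors):
    """Area on the RGO scale for `day`, or None if no record has data.

    records: {source name: {date: area}}
    factors: {source name: factor that brings the source to the RGO scale}
    """
    first = primary_source(day)
    order = [first] + [s for s in FALLBACK if s != first]
    for source in order:
        value = records.get(source, {}).get(day)
        if value is not None:
            return value * factors[source]
    return None
```

A day for which this function returns None remains a gap in the composite.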

5. The Photometric Sunspot Index 

The passage of sunspots across the solar disc causes a decrease in the total solar
irradiance. This effect can be quantified by estimating the photometric sunspot index, $P_S$
[Hudson et al., 1982]. First, the deficit of radiative flux, $\Delta S_s$, due to the presence
of a sunspot of area $A_s$ is calculated as:

$$\frac{\Delta S_s}{S_Q} = \frac{\mu A_s (C_s - 1)(3\mu + 2)}{2}. \qquad (2)$$

This value is expressed in units of $S_Q$, the solar irradiance for the quiet Sun (i.e. solar
surface free of magnetic fields). $S_Q = 1365.5$ W/m$^2$ is taken from the PMOD composite
of measured solar irradiance [Fröhlich, 2003, 2006]. We use the areas composite obtained
here, $A_s$, and the heliocentric positions, $\mu$, of the sunspots present on the solar disc. The
residual intensity contrast of the sunspot relative to that of the background photosphere,
$C_s - 1$, is taken from Brandt et al. [1992]. It takes into account the dependence of the
sunspot residual intensity contrast on sunspot area, i.e., larger sunspots are darker than
smaller spots, as has recently been confirmed on the basis of MDI data by Mathew et al.
[2007]. Following Brandt et al. [1992, 1994] and Fröhlich et al. [1994] we use:

$$C_s - 1 = 0.2231 + 0.0244 \cdot \log(A_s). \qquad (3)$$

Finally, summing the effects from all the sunspots present on the disc we obtain:

$$P_S = \sum_{i=1}^{n} \left(\frac{\Delta S_s}{S_Q}\right)_i.$$

Figure 5 shows the 12-month running mean time series of the $P_S$ index for the period
1874 - 2008. The daily $P_S$ values are also listed in Table 4 (available electronically).
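
The calculation of the index for one day can be sketched as below. Several points are assumptions rather than statements from the text: the logarithm in Eq. (3) is taken to be base 10, the area entering Eq. (3) is taken in millionths of the solar hemisphere, and the area entering Eq. (2) is taken as a fraction of the hemisphere. The function names are hypothetical.

```python
import math

S_Q = 1365.5  # quiet-Sun irradiance in W/m^2, as quoted in the text


def residual_contrast(area_msh):
    """C_s - 1 from Eq. (3); area in millionths of the hemisphere (assumed), log10 assumed."""
    return 0.2231 + 0.0244 * math.log10(area_msh)


def spot_deficit(mu, area_msh):
    """Delta S_s / S_Q from Eq. (2) for one spot at heliocentric position mu."""
    area_fraction = area_msh * 1e-6  # assumption: Eq. (2) uses a fraction of the hemisphere
    return mu * area_fraction * residual_contrast(area_msh) * (3 * mu + 2) / 2


def photometric_sunspot_index(spots):
    """Sum of the deficits of all spots; `spots` is a list of (mu, area_msh) pairs."""
    return sum(spot_deficit(mu, area) for mu, area in spots)
```

Multiplying the result by `S_Q` converts the index from units of the quiet-Sun irradiance to W/m^2. The factor $(3\mu + 2)$ makes a spot near disc centre count more than the same spot near the limb.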

6. An example of errors introduced by an uncritical use of uncalibrated sunspot areas data sets

Variations of solar irradiance on time scales longer than approximately a day are caused
by the passage of dark sunspots and bright faculae across the solar disc. Due to the
different wavelength dependences of their contrasts, the contribution of faculae is higher
in the UV than in the visible or IR, whereas the contribution of sunspots dominates
increasingly with increasing wavelength [Solanki and Unruh, 1998; Unruh et al., 1999].
Thus employment of a faulty or inconsistent sunspot or faculae time series to reconstruct
solar total and UV irradiance can lead to systematic differences between them.

It has been claimed that variations of solar UV irradiance are less important for
climate than variations of solar total irradiance, $S$ [Foukal, 2002; Foukal et al., 2006].
These results are based on uncalibrated sunspot areas including both the Greenwich and
the SOON data sets. Here we show that when sunspot areas after appropriate
intercalibration as described in Section 3.1 are employed, total and UV solar irradiance
behave similarly.

We redo the analysis of Foukal [2002], but employing the cross-calibrated time series of
sunspot areas obtained here. For the facular contribution we employ the same proxy as
Foukal [2002], a monthly mean time series of plage plus enhanced network areas, $A_{PN}$.
This data set was kindly provided by P. Foukal. Areas were measured from
spectroheliograms and photoheliograms in the K-line of Ca II obtained at Mt. Wilson,
McMath-Hulbert and Big Bear observatories in the period 1915 - 1984 [Foukal, 1996, 1998].
Later, this time series was extended until 1999 using data from Sacramento Peak Observatory
(SPO). The data cover the period August 1915 - December 1999 inclusive. The
identification of plages and enhanced network was performed by several observers. Details
about the reduction procedure to derive the $A_{PN}$ index can be found in Foukal [1996].
$A_{PN}$ values are expressed in fractions of the solar disc.

Total and UV solar irradiance time series are reconstructed following Foukal [2002].
According to that approach, enhancements in total solar irradiance are proportional to the
difference in plage, $A_{PN}$, and sunspot areas, $A_s$, whereas enhancements in UV irradiance
are proportional to the plage areas alone. As a first step, residuals of solar irradiance
after removing the sunspot darkening, $S - P_S$, are calculated for the time when irradiance
measurements are available, i.e. from 1978 till present. This quantity, $S - P_S$, is a measure
of facular contribution to the total irradiance. Total solar irradiance measurements, $S$, are
taken from the PMOD composite derived from different instruments with best allowance
for their degradation and inter-calibration [Fröhlich, 2000, 2006]. Then, a regression
relation of the form $S - P_S = b \cdot A_{PN} + a$ is constructed between the monthly mean
values of these residuals and of the plage areas, $A_{PN}$. This regression relation is then used
to reconstruct the residuals $(S - P_S)_{rec}$ between 1915 and 1999 when values of $A_{PN}$ are
available. The reconstructed total solar irradiance is finally obtained by just adding back
the time series of $P_S$ over this period.
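
The regression step can be illustrated with the following sketch. It uses an ordinary least-squares line, which is an assumption (the text only says that a regression relation of the stated form is constructed), and it follows the text literally in forming the residual as $S - P_S$ and adding $P_S$ back at the end. The function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = b * x + a; returns (b, a)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return b, mean_y - b * mean_x


def calibrate_facular_term(s, p_s, a_pn):
    """Fit S - P_S = b * A_PN + a over the period with irradiance measurements."""
    residuals = [si - pi for si, pi in zip(s, p_s)]
    return fit_line(a_pn, residuals)


def reconstruct_total(a_pn, p_s, b, a):
    """Reconstructed residuals b * A_PN + a with the P_S series added back."""
    return [b * x + a + p for x, p in zip(a_pn, p_s)]
```

In use, `calibrate_facular_term` would be applied to monthly means from the measurement era, and `reconstruct_total` to the full interval over which plage areas are available. Any change in the sunspot area calibration enters only through `p_s`.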

Figure 6 shows the 11-yr running means of the reconstructed total irradiance using
calibrated (thick dotted line) and non-calibrated (thick dashed line) data. The thin lines
represent the 1-yr means of both reconstructions. The curves were scaled in order to
highlight the difference in the upward trend after 1970. The dashed curve represents the
UV irradiance, i.e., the solar flux at wavelengths shorter than 250 nm. Its variability is
determined mainly by the bright magnetic plages in active regions and enhanced network
produced as these regions decay. Its reconstruction follows the same steps as that of the
total solar irradiance, except that the last step (adding back $P_S$) is not carried out.

The total irradiance reconstructed by Foukal [2002], which is very similar to the grey
curve in Fig. 6, shows a clear upward trend after the year 1976 due to the strong presence
of faculae that is not balanced by increased sunspot area. The UV irradiance does not
display such a prominent rise, however. This result was interpreted by Foukal [2002] as
evidence for a strongly different behaviour of the total irradiance and UV irradiance and
consequently for their very different influence on the Earth's climate. In particular, the fact
that the TSI correlates much better with global climate than the UV irradiance during
the last three decades led Foukal [2002] to propose that UV irradiance influences global
climate less than total irradiance. However, we find here that this behaviour is no longer
observed when appropriately calibrated areas are used. The shape of the total irradiance
estimated from calibrated data now follows closely the shape of the variation in $A_{PN}$,
i.e. the UV irradiance [cf. Solanki and Krivova, 2003]. It is not by chance that the two
reconstructions of $S$ start to diverge in ~1976 since at that time the record of $A_s$ from
RGO ends.

We stress that the simple approach used here to reconstruct total and UV solar
irradiance has shortcomings. One concerns the $A_{PN}$ time series, which is based on
uncalibrated spectroheliograms. Film calibration in photographic plates and variable image
quality are some of the factors that introduce uncertainties in the extraction of the features
and need to be taken into account. They affect the correct identification of different
features in the Ca II K images, which is based on criteria of decreasing intensity, decreasing
size or decreasing filling factors [Worden et al., 1998]. Another concerns the simplicity of
the model assumed here, which successfully reproduces the cyclic variation but does not
contain a secular trend, unlike more detailed and complete recently developed models, for
instance: Foster [2004]; Wang et al. [2005]; Krivova et al. [2007]. Such a secular trend can be
produced by long-term changes in the network, which is only poorly sampled by the $A_{PN}$
data employed here. These shortcomings have no influence on the conclusions drawn, however.
It is not the aim of this section to produce realistic records of total and UV irradiance,
but rather to demonstrate the importance of using a carefully cross-calibrated sunspot
areas time series. In particular, our conclusion that total solar irradiance shows no strong
upward trend in the three decades since 1976 is supported by the irradiance composite of
Fröhlich [2000, 2006] and the modelling work of Wenzler et al. [2006].

7. Summary and conclusions 

In this work, we have compared sunspot areas measured at different observatories. 
We found a good agreement between sunspot areas measured by Russian stations and 
RGO, while a comparison of sunspot areas measured by the SOON network with Russian 
data shows a difference of about 40% for projected areas and 44% in areas corrected for 
foreshortening. This is at least partly due to the different minimum areas of sunspots 
taken into account in these data sets: smallest areas included in the RGO and Russian 
records are 10 times smaller than those in the SOON series (see Table 1). Histograms 
of sunspot areas show that such small sunspots are rather common [Bogdan et al., 1988; 
Baumann and Solanki, 2005]. SOON sunspot areas are combined with those from RGO 
and Russia by multiplying them by a factor of 1.43 in the case of projected areas and 
1.49 in the case of areas corrected for foreshortening. Data from other observatories are 
employed to fill up some of the remaining gaps. In this manner, a consistent sunspot area 
database is produced from 1874 to 2008. 
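
The RGO-to-SOON factors quoted above can be checked by chaining the RGO-Russia and Russia-SOON factors of Table 2. The short sketch below does this; the function name chain_factors is illustrative, and combining the relative uncertainties in quadrature is an assumption (the text does not state how the uncertainties of the chained factors were obtained), although it reproduces the tabulated values of 1.429 ± 0.163 and 1.489 ± 0.194.

```python
import math

def chain_factors(f1, e1, f2, e2):
    """Multiply two calibration factors.

    The relative uncertainties are combined in quadrature; this error
    propagation is an assumption, not a statement of the paper's method.
    """
    f = f1 * f2
    rel = math.hypot(e1 / f1, e2 / f2)
    return f, f * rel

# Factors and uncertainties taken from Table 2:
# RGO-Russia followed by Russia-SOON.
pa = chain_factors(1.019, 0.067, 1.402, 0.131)  # projected areas
ca = chain_factors(1.028, 0.083, 1.448, 0.148)  # areas corrected for foreshortening

print("PA: %.3f +/- %.3f" % pa)
print("CA: %.3f +/- %.3f" % ca)
```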

A properly cross-calibrated sunspot areas data set is central for, e.g., reliable reconstructions 
of total and spectral solar irradiance. In order to demonstrate this, we have also 
presented a simple reconstruction of total and UV solar irradiance based on sunspot and 
plage plus enhanced network areas for the period 1915 - 1999. We showed that the use 
of data from different sources combined directly, without a proper cross-calibration, can lead 
to significantly erroneous estimates of the increase of solar irradiance in the last decades. 
This means in particular that the claim of Foukal [2002] that UV solar irradiance is far 
less effective in driving climate change than total solar irradiance has no basis. 

Data from additional observatories, such as Debrecen Observatory in Hungary [Győri 
et al., 1998, 2000], will help to improve the sunspot areas record even further. Another 
interesting possibility not explored here would be the comparison with data from space-borne 
observations, which are unaffected by seeing. SOHO/MDI [Scherrer et al., 1995] 
has provided continuous data free of atmospheric effects since 1996. Győri et al. 
[2005] and Győri and Baranyi [2006] have presented a comparison between areas measured 
by Debrecen Observatory and MDI for 1996 and 1997. After applying the same 
procedure for determining sunspot areas to both data sets, they found that MDI areas 
are 17% larger. They attribute this difference to the smaller scale of MDI images with 
respect to that of Debrecen data. Wenzler [2005], on the other hand, compared umbrae 
and penumbrae areas derived from continuum images taken at the Kitt Peak Observatory 
(KP) and MDI. From the analysis of 24 selected days at different levels of solar activity 
between 1997 and 2001, he obtained almost identical values for locations and areas for 
both data sets by applying an appropriate threshold. He also compared total daily KP 
sunspot areas and the composite presented here. The comparison showed that SPM areas 
are about 4% lower for the period 1992-2003 (2055 days). This shows that it is possible to 
combine ground-based and space-based measurements of sunspot areas into a single time 
series. 

Appendix A: On the effect of including an offset in the calculation of cross-calibration factors 

Let us consider sunspot areas recorded by two different observatories, Obs. 1 and 
Obs. 2, during the same period. Let b be the slope of the linear regression when the area 
recorded by Obs. 2 is the independent variable and b' the slope when the area recorded 
by Obs. 1 is the independent variable. In the ideal case, b = 1/b'. However, for real data 
sets this is not true. There are two reasons for this. Firstly, since sunspot areas cannot 
be negative, values close to zero introduce a bias into the regression coefficients. As a 
result, the slopes we obtain including an offset (dashed lines in Fig. 7) are typically lower 
than the ones obtained by considering no offset (solid lines in Fig. 7). In particular, the 
obtained b is always lower than 1/b', whereas b' is lower than 1/b. In order to overcome 
this, we force the fit to go through the origin (solid lines in Fig. 7). The corresponding 
slopes typically increase, such that values of b and 1/b' become closer to each other, al- 
though they still differ. Secondly, when carrying out a linear regression to the relationship 
between the observatories, we assume measurements by one of them to be free of errors, 
whereas in reality both records are subject to errors. This immediately produces different 
regressions depending on which data set is plotted on the ordinate. This is well illustrated 
by comparing the encircled data point in Figs. 7a and b (it corresponds to the same data 
point in both). In Fig. 7a, the point significantly lowers the regression slope, since there 
are hardly any data points at that location of the x-axis, while in Fig. 7b its influence is 
small, since it now lies at a well populated part of the x-axis. By removing such outliers, 
we further reduce the difference between b and 1/b', but they are still not identical for 
purely statistical reasons. Therefore, as final factors we take the average between b and 
1/b' [Isobe et al., 1990]. 

A more complicated case is the one when there is a significant offset between Obs. 1 
and Obs. 2, for example due to the difference in the minimum area of the considered 
spots (see, e.g. Fig. 7c and Table 1). In this case it may happen that b is not lower than 
1/b'. Then the slopes obtained by forcing the fit to go through zero do not necessarily 
improve the original ones. However, we apply the same procedure, neglecting the offset, 
for the following reasons: (1) the magnitude of the offset is rather uncertain due to the 
bias introduced by the positivity of the sunspot areas; (2) in doing this we may introduce 
some errors mainly at low values of sunspot areas, whereas values obtained during high 
activity levels which are of higher priority here are on average relatively reliable; (3) the 
real slope still lies in the range [b, 1/b'] (or [b', 1/b]) so that an average of b and 1/b' (or b' 
and 1/b) is a good approximation and, finally, (4) there are only few such cases (like the 
comparisons between areas from Russia and Rome and from Rome and SOON shown in 



An additional possible reason for the difference between b and 1/b' may be that the true 
relationship is non-linear. However, the scatter in the data is too large to reach any firm 
conclusion on this. 
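
The procedure described in this appendix (fits forced through the origin in both directions, followed by averaging b and 1/b') can be sketched as follows. This is a minimal illustration on synthetic data, not the code used for the paper: the function names, the noise levels and the factor of 1.4 are assumptions, and the removal of outliers and of points close to the origin is omitted. For a fit through the origin the product b·b' cannot exceed 1 (Cauchy-Schwarz inequality), so b ≤ 1/b' whenever the two records are positively correlated, and the averaged factor lies between the two.

```python
import numpy as np

def slope_through_origin(x, y):
    """Least-squares slope of y = slope * x, with no offset term."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(x * y) / np.sum(x * x))

def calibration_factor(area_obs1, area_obs2):
    """Return (average of b and 1/b', b, 1/b') for two overlapping records."""
    b = slope_through_origin(area_obs2, area_obs1)        # Obs. 2 independent
    b_prime = slope_through_origin(area_obs1, area_obs2)  # Obs. 1 independent
    return 0.5 * (b + 1.0 / b_prime), b, 1.0 / b_prime

# Synthetic example (hypothetical numbers): Obs. 1 reports areas 1.4 times
# larger than Obs. 2, and both records carry measurement noise.
rng = np.random.default_rng(0)
true_area = rng.uniform(50.0, 2000.0, size=500)
obs2 = true_area + rng.normal(0.0, 60.0, size=500)
obs1 = 1.4 * true_area + rng.normal(0.0, 60.0, size=500)

factor, b, inv_b_prime = calibration_factor(obs1, obs2)
print(f"b = {b:.3f}, 1/b' = {inv_b_prime:.3f}, factor = {factor:.3f}")
```

Because noise in the independent variable lowers each fitted slope, b falls slightly below the input ratio and 1/b' slightly above it, which is why the average is a reasonable compromise in this toy case.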

Appendix B: An alternative method to calculate cross-calibration factors 

In addition to the method described in Section 3.1 and Appendix A to cross-calibrate 
different sunspot area data sets, we also performed the cross-calibration by varying a 
parameter f (defined below) in order to minimize a merit function M, calculated over the 
N days on which both A_bas and A_aux are available: 

The merit function is used here since, due to the lack of individual errors for daily 
measurements, the classical definition of χ² cannot be applied. 

In order to find the absolute minimum of M, irrespective of the presence of any 
secondary minima, a genetic algorithm called Pikaia is used [Charbonneau, 1995, 
http://www.hao.ucar.edu/public/research/pikaia/pikaia.html]. 
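
The minimization can be illustrated with a small sketch. Two things here are assumptions rather than the paper's method: the explicit form of M is not reproduced in this text, so a root-mean-square difference between A_bas and f·A_aux is assumed, and a plain grid search stands in for the Pikaia genetic algorithm. The data are synthetic and the names merit and best_factor are hypothetical. The arrays are taken to contain only the days common to both records.

```python
import numpy as np

def merit(f, a_bas, a_aux):
    """ASSUMED merit function: RMS of (A_bas - f * A_aux) over common days."""
    return float(np.sqrt(np.mean((a_bas - f * a_aux) ** 2)))

def best_factor(a_bas, a_aux, f_grid):
    """Grid search for the f that minimizes the merit function.

    This is a simple stand-in for a global optimizer such as Pikaia.
    """
    values = np.array([merit(f, a_bas, a_aux) for f in f_grid])
    i = int(np.argmin(values))
    return float(f_grid[i]), float(values[i])

# Synthetic daily areas (hypothetical): the basis record is 1.35 times the
# auxiliary one, plus noise.
rng = np.random.default_rng(1)
a_aux = rng.uniform(0.0, 1500.0, size=400)
a_bas = 1.35 * a_aux + rng.normal(0.0, 80.0, size=400)

f_grid = np.linspace(0.5, 2.5, 2001)  # grid step of 0.001
f_best, m_best = best_factor(a_bas, a_aux, f_grid)

# For this assumed quadratic merit function the minimum is known analytically.
f_closed = float(np.sum(a_bas * a_aux) / np.sum(a_aux ** 2))
print(f"grid: f = {f_best:.3f}, M = {m_best:.1f}; closed form: f = {f_closed:.3f}")
```

Under the assumed form of M, the minimizing f coincides with the slope of a fit through the origin with A_aux as the independent variable, i.e. with b of Appendix A rather than with the average of b and 1/b'. A global optimizer only becomes necessary if the actual merit function has secondary minima.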

In Fig. 8 we show the comparison between data from SOON and Rome, which overlap 
for a long period of time. A 12-month running mean of the original data vs time (upper 
panel) as well as the difference between the data from the two observatories, for both 
original and calibrated data (lower panel), are shown. 

In Table 3, values of the calibration factors for projected sunspot areas and for areas 
corrected for foreshortening obtained using this technique are listed. The corresponding 
values for M are also tabulated. In all cases, these factors are lower than the ones found 
as explained in Section 3.1 and Appendix A. Note, however, that if we first form (monthly 
or yearly) running means of A_bas and A_aux before minimizing M, we obtain calibration 
factors much closer to those listed in Table 2. This is because outliers 
are given a much smaller weight when forming running means than when taking the squared 
difference between daily data. 

This technique differs from the one discussed in Section 3.1 and Appendix A in that 
here the same weight is given to the maximum and minimum phases of the solar cycle. It can 
be seen from Fig. 8 that after calibration the difference between both data sets is very 
close to zero during activity minimum. However, during times of high solar activity this 
calibration technique does not give as accurate results. 

As mentioned before, one of the most important applications of sunspot areas data sets 
is irradiance reconstruction. We therefore intend to produce a homogeneous and as complete as 
possible time series of sunspot areas that can be used in irradiance models to describe the 
variations adequately. Since the sunspot contribution to these variations is most important 
during times of high activity, a method giving larger weight to periods of high activity 
(large spot areas) should provide a more appropriate calibration factor. For this reason, 
we use the factors obtained with the method explained in Section 3.1 and Appendix A as 
the default. Note, however, that in almost all cases factors obtained by the two methods 
agree within the given uncertainties (even if Eq. B1 is applied to daily data, without first 
forming running means). 

Figure 1. Comparison of sunspot areas corrected for foreshortening obtained by different observatories. Top: RGO 
vs Russia, bottom: Russia vs SOON. Left: scatter plots. Solid lines represent linear regressions to the data neglecting a 
possible offset (i.e., forced to pass through zero) as well as data points close to the origin and the outliers lying outside the 
±3σ interval from the fit. Right: 12-month running means of sunspot areas vs time. Solid curves show the data used as 
basis level, dotted are the data from the second observatory and dot-dashed the calibrated areas. 

Figure 2. Sunspot areas corrected for foreshortening vs sunspot number, R_z, for measurements made by RGO. The 
dots represent daily values between 1874 and 1976. Each '+' symbol represents an average over bins of 50 points. 

Figure 3. Sunspot areas corrected for foreshortening vs Zurich relative sunspot number, R_z, for measurements 
made by RGO and SOON. Each symbol represents an average over bins of 50 points. The original data from SOON are 
represented by diamonds, and those after multiplication by a calibration factor of ~1.5 in order to match RGO data, by 
squares. 

Figure 4. 12-month running means of sunspot areas corrected for foreshortening of the final composite using 
the factors given in Table 2 (solid curve). Also plotted are the Russian and SOON data entering the composite prior to 
calibration. 

Figure 5. 12-month running mean of the photometric sunspot index, P_S, computed using the sunspot areas 
composite produced here. The y-axis is expressed in percent of S_Q, the irradiance of the quiet Sun. 

Figure 6. 11-yr running mean of total solar irradiance based on calibrated areas (dotted line), non-calibrated areas 
(dashed line) and UV irradiance given by the variations in A_PN (solid line). The thin lines represent 1-year means of 
solar irradiance based on calibrated areas (dotted line) and non-calibrated areas (dashed line). The y-axis is expressed in 
percent of S_Q, the irradiance of the quiet Sun. 

Figure 7. Comparison between sunspot areas recorded by different observatories. Areas are in units of millionths of 
the solar disc. The lines represent linear regressions to the data: standard (dashed) and forced to pass through the origin 
(solid). The encircled data point is discussed in the text. 

Figure 8. Upper panel: 12-month running means of projected sunspot areas. Areas are in units of millionths of 
the solar disc. They correspond to original values without any calibration. The solid line represents data from the Rome 
Observatory, the dashed line represents data from SOON. Lower panel: The solid line represents the 12-month running 
mean of the difference between data from these two observatories after calibrating the data. The dashed line is the difference 
between the original data from both observatories. 

Table 1. Data provided by the different observatories 

Observatory   Observation Period   Observing technique   Coverage   Min. area reported 
                                                         [%]        [ppm of solar hemisphere] 

RGO           1874 - 1976          photographic plates   98         1 
Russia        1968 - 1991          photographic plates   96         1 
SOON          1981 - 2008          drawings              98         10 
Rome          1958 - 1999          photographic plates   50         2 
Catania       1978 - 1987          drawings              81         3 
Yunnan        1981 - 1992          photographic plates   81         2 

Table 2. Calibration factors for the different observatories (PA: projected areas; CA: areas corrected for foreshortening) 

Obs. 1   Obs. 2    Overlap        Calibration       Calibration       Correlation   Correlation 
                   Period         Factor PA         Factor CA         Coeff. PA     Coeff. CA 

RGO      Russia    1968 - 1976    1.019 ± 0.067     1.028 ± 0.083     0.974         0.961 
Russia   SOON      1982 - 1991    1.402 ± 0.131     1.448 ± 0.148     0.953         0.942 
RGO      Rome      1958 - 1976    1.095 ± 0.086     1.097 ± 0.084     0.969         0.963 
Russia   Rome      1968 - 1990    1.169 ± 0.058     1.227 ± 0.107     0.947         0.907 
SOON     Rome      1982 - 1999    0.791 ± 0.105     0.846 ± 0.138     0.946         0.902 
Russia   Yunnan    1968 - 1990    1.321 ± 0.215     1.365 ± 0.242     0.959         0.947 
SOON     Yunnan    1982 - 1992    0.913 ± 0.113     0.907 ± 0.131     0.948         0.955 
Russia   Catania   1978 - 1987    1.236 ± 0.052     1.226 ± 0.059     0.959         0.909 
SOON     Catania   1982 - 1987    0.948 ± 0.042     0.925 ± 0.097     0.967         0.949 
RGO      SOON      via Russia     1.429 ± 0.163     1.489 ± 0.194 
RGO      Catania   via Russia     1.240 ± 0.099     1.234 ± 0.119 
RGO      Yunnan    via Russia     1.346 ± 0.237     1.403 ± 0.273 

Table 3. Calibration factors for projected and corrected sunspot areas measured by 
different observatories obtained by minimizing M (see Appendix B) 

Obs. 1   Obs. 2    Calibration   Calibration   M       M 
                   Factor PA     Factor CA     PA      CA 

RGO      Russia    1.014         1.012         68.9    94.6 
Russia   SOON      1.352         1.399         136.1   242.7 
RGO      Rome      1.083         1.073         61.3    70.3 
Russia   Rome      1.121         1.106         220.5   185.9 
SOON     Rome      0.764         0.733         215.8   157.6 
Russia   Yunnan    1.279         1.326         111.5   130.6 
SOON     Yunnan    0.895         0.882         75.6    87.9 
Russia   Catania   1.205         1.197         202.1   164.5 
SOON     Catania   0.924         0.853         197.5   244.5 